Data augmentation is a widely used technique for enhancing the generalization ability of convolutional neural networks (CNNs) in image classification tasks. Occlusion is a critical factor that affects the generalization ability of image classification models. To generate new samples, existing data augmentation methods based on information deletion simulate occluded samples by randomly removing some areas of the images. However, those methods cannot delete areas according to the structural features of the images. To solve this problem, we propose a novel data augmentation method, AdvMask, for image classification tasks. Instead of randomly removing areas of the images, AdvMask obtains the key points that have the greatest influence on the classification results via an end-to-end sparse adversarial attack module. We can therefore find the points to which the classification results are most sensitive, regardless of the diversity of image appearances and of the shapes of the objects of interest. In addition, a data augmentation module is employed to generate structured masks based on the key points, thus forcing CNN classification models to seek other relevant content when the most discriminative content is hidden. AdvMask can effectively improve the performance of classification models at test time. Experimental results on various datasets and CNN models verify that the proposed method outperforms previous data augmentation methods in image classification tasks.
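The masking step described above can be illustrated with a minimal sketch. It assumes the key points are already given (in AdvMask they come from the sparse adversarial attack module, which is not reproduced here) and hides a square patch around each one. The function name `mask_around_key_points`, the square patch shape, and the `half_size` and `fill` parameters are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # a single-channel image as rows of pixel values


def mask_around_key_points(
    image: Image,
    key_points: Sequence[Tuple[int, int]],
    half_size: int = 1,
    fill: float = 0.0,
) -> Image:
    """Return a copy of `image` with a square patch hidden around each key point.

    Assumption: `key_points` are (row, col) positions supplied by some
    external sensitivity analysis; this sketch does not compute them.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    masked = [row[:] for row in image]  # leave the input untouched
    for r, c in key_points:
        # Clip the patch to the image borders.
        for y in range(max(0, r - half_size), min(height, r + half_size + 1)):
            for x in range(max(0, c - half_size), min(width, c + half_size + 1)):
                masked[y][x] = fill
    return masked


if __name__ == "__main__":
    img = [[1.0] * 5 for _ in range(5)]
    for row in mask_around_key_points(img, [(2, 2)]):
        print(row)
```

Training on such masked copies alongside the originals is what pushes a classifier to rely on content outside the most sensitive regions.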
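Deep neural networks are powerful, but they also have shortcomings, such as sensitivity to adversarial examples, noise, blur, and occlusion. Many previous works have been proposed to improve one specific kind of robustness. However, we find that a specific robustness is often improved at the expense of other kinds of robustness or of the generalization ability of the neural network model. In particular, when improving adversarial robustness, adversarial training methods severely hurt generalization performance on unperturbed data. In this paper, we propose a new data processing and training method, called AugRmixAT, which can simultaneously improve the generalization ability and multiple kinds of robustness of neural network models. Finally, we validate the effectiveness of AugRmixAT on the CIFAR-10/100 and Tiny-ImageNet datasets. Experiments show that AugRmixAT can improve the generalization performance of models while enhancing white-box robustness, black-box robustness, common-corruption robustness, and partial-occlusion robustness.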
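In recent years, face recognition has made great progress thanks to the development of deep neural networks, but deep neural networks have recently been found to be vulnerable to adversarial examples. This means that face recognition models or systems based on deep neural networks are also susceptible to adversarial examples. However, existing methods of attacking face recognition models or systems with adversarial examples can effectively perform white-box attacks but not black-box impersonation attacks, physical attacks, or convenient attacks, particularly on commercial face recognition systems. In this paper, we propose a new method for attacking face recognition models or systems, called RSTAM, which enables an effective black-box impersonation attack using an adversarial mask printed by a mobile and compact printer. First, RSTAM enhances the transferability of the adversarial masks through our proposed random similarity transformation strategy. Furthermore, we propose a random meta-optimization strategy for ensembling several pre-trained face models to generate more general adversarial masks. Finally, we conduct experiments on the CelebA-HQ, LFW, Makeup Transfer (MT), and CASIA-FaceV5 datasets. The performance of the attacks is also evaluated on state-of-the-art commercial face recognition systems: Face++, Baidu, Aliyun, Tencent, and Microsoft. Extensive experiments show that RSTAM can effectively perform black-box impersonation attacks on face recognition models or systems.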
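Image transformation, a class of vision and graphics problems whose goal is to learn the mapping between an input image and an output image, has developed rapidly in the context of deep neural networks. In computer vision (CV), many problems can be regarded as image transformation tasks, e.g., semantic segmentation and style transfer. These works have different topics and motivations, which has made image transformation a flourishing area. Some surveys review only research on style transfer or on image-to-image translation, each of which is just one branch of image transformation. However, to the best of our knowledge, no survey summarizes these works together in a unified framework. This paper proposes a novel learning framework comprising independent learning, guided learning, and cooperative learning, called the IGC learning framework. The image transformation we discuss mainly involves general image-to-image translation and style transfer with deep neural networks. From the perspective of this framework, we review these subtasks and give a unified interpretation of various scenarios. We categorize related subtasks of image transformation according to similar development trends. Furthermore, experiments have been conducted to verify the effectiveness of IGC learning. Finally, new research directions and open problems are discussed for future research.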
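One of the most effective approaches to channel pruning is to prune according to the importance of each neuron. However, measuring the importance of each neuron is an NP-hard problem. Previous works prune by considering the statistics of a single layer or of several consecutive layers of neurons. These works cannot eliminate the effect of different data on the model in the reconstruction error, and no existing work proves that the absolute values of the parameters can be used directly as the basis for judging the importance of the weights. A more reasonable approach is to eliminate the differences between batches of data so that influence is measured accurately. In this paper, we propose to use ensemble learning to train a model on different batches of data and to use the influence function (a classic technique from robust statistics) to trace the model's prediction through the learning algorithm and back to the gradients of its training parameters, so that we can determine the responsibility of each parameter in the prediction process, which we call "influence". In addition, we theoretically prove that the back-propagation of a deep network is a first-order Taylor approximation of the influence function of the weights. We perform extensive experiments to show that pruning based on the influence function, using the idea of ensemble learning, is much more effective than focusing only on error reconstruction. Experiments on CIFAR show that influence pruning achieves state-of-the-art results.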
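To make the first-order Taylor idea concrete, here is a minimal sketch of a generic gradient-times-weight channel score averaged over several batches. It is a swapped-in, simplified criterion and not the paper's influence-function or ensemble procedure; the names `taylor_channel_scores` and `channels_to_prune`, and the flat per-channel weight layout, are assumptions made for illustration.

```python
from typing import List

Channel = List[float]  # flattened weights (or gradients) of one channel


def taylor_channel_scores(
    weights: List[Channel],
    grads_per_batch: List[List[Channel]],
) -> List[float]:
    """Score each channel by |sum_i w_i * g_i|, averaged over batches.

    `grads_per_batch[b][c]` holds the loss gradient for channel `c` on
    batch `b`, with the same layout as `weights[c]` (an assumed layout).
    Averaging over batches reduces the dependence on any single batch.
    """
    scores = []
    for c, w in enumerate(weights):
        per_batch = [
            abs(sum(wi * gi for wi, gi in zip(w, grads[c])))
            for grads in grads_per_batch
        ]
        scores.append(sum(per_batch) / len(per_batch))
    return scores


def channels_to_prune(scores: List[float], k: int) -> List[int]:
    """Return the indices of the `k` lowest-scoring channels, sorted."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])


if __name__ == "__main__":
    w = [[1.0, 2.0], [0.5, 0.5], [3.0, 0.0]]
    g = [
        [[1.0, 1.0], [0.0, 0.0], [1.0, 5.0]],
        [[-1.0, -1.0], [2.0, 2.0], [0.0, 0.0]],
    ]
    s = taylor_channel_scores(w, g)
    print(s, channels_to_prune(s, 1))
```

The score uses the gradient rather than the weight magnitude alone, which is the point the abstract makes about absolute parameter values being an unproven importance measure.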
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
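The first insight, building a class center from support features with the help of the support mask and using it to re-weight query features, can be sketched with masked average pooling followed by channel-wise scaling. This is one common way to realize the idea and is an assumption here; RefT's actual dynamic weighting module may differ. The names `masked_average_pool` and `reweight_query` are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width
Mask = List[List[float]]              # height x width, 1.0 inside the object


def masked_average_pool(features: FeatureMap, mask: Mask) -> List[float]:
    """Average each channel over the masked region to get a class center."""
    area = sum(sum(row) for row in mask)
    if area == 0:
        # No foreground pixels: return a zero center instead of dividing by 0.
        return [0.0] * len(features)
    return [
        sum(v * m for f_row, m_row in zip(channel, mask)
            for v, m in zip(f_row, m_row)) / area
        for channel in features
    ]


def reweight_query(query: FeatureMap, center: List[float]) -> FeatureMap:
    """Scale every query channel by the matching class-center value."""
    return [
        [[v * w for v in row] for row in channel]
        for channel, w in zip(query, center)
    ]


if __name__ == "__main__":
    support = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
    mask = [[1.0, 0.0], [0.0, 1.0]]
    center = masked_average_pool(support, mask)
    query = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
    print(center, reweight_query(query, center))
```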
Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process with sequential information retained and use it to simulate the evolutionary attention of the ground truth summary. The event-level attention can also be used to assist in extracting the summary, so that the extracted summary also comes in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline 17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
In this tutorial paper, we look into the evolution and prospects of network architecture and propose a novel conceptual architecture for 6th generation (6G) networks. The proposed architecture has two key elements, i.e., holistic network virtualization and pervasive artificial intelligence (AI). Holistic network virtualization consists of network slicing and digital twin, from the aspects of service provision and service demand, respectively, to incorporate service-centric and user-centric networking. Pervasive network intelligence integrates AI into future networks from the perspectives of networking for AI and AI for networking, respectively. Building on holistic network virtualization and pervasive network intelligence, the proposed architecture can facilitate three types of interplay, i.e., the interplay between digital twin and network slicing paradigms, between model-driven and data-driven methods for network management, and between virtualization and AI, to maximize the flexibility, scalability, adaptivity, and intelligence of 6G networks. We also identify challenges and open issues related to the proposed architecture. By providing our vision, we aim to inspire further discussions and developments on the potential architecture of 6G.
In this paper, we investigate the joint device activity and data detection in massive machine-type communications (mMTC) with a one-phase non-coherent scheme, where data bits are embedded in the pilot sequences and the base station simultaneously detects active devices and their embedded data bits without explicit channel estimation. Due to the correlated sparsity pattern introduced by the non-coherent transmission scheme, the traditional approximate message passing (AMP) algorithm cannot achieve satisfactory performance. Therefore, we propose a deep learning (DL) modified AMP network (DL-mAMPnet) that enhances the detection performance by effectively exploiting the pilot activity correlation. The DL-mAMPnet is constructed by unfolding the AMP algorithm into a feedforward neural network, which combines the principled mathematical model of the AMP algorithm with the powerful learning capability, thereby benefiting from the advantages of both techniques. Trainable parameters are introduced in the DL-mAMPnet to approximate the correlated sparsity pattern and the large-scale fading coefficient. Moreover, a refinement module is designed to further advance the performance by utilizing the spatial feature caused by the correlated sparsity pattern. Simulation results demonstrate that the proposed DL-mAMPnet can significantly outperform traditional algorithms in terms of the symbol error rate performance.
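Unfolding AMP into a feedforward network can be illustrated with a minimal sketch in which each layer is one AMP iteration with its own threshold. The sketch uses the textbook real-valued AMP with soft thresholding and an Onsager correction; it is not DL-mAMPnet, which handles correlated sparsity, large-scale fading, and a refinement module. The name `unfolded_amp` and the per-layer `thresholds` list (standing in for trainable parameters) are assumptions for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # M x N measurement (pilot) matrix


def soft_threshold(v: float, theta: float) -> float:
    """Shrink `v` toward zero by `theta` (the sparsity-promoting denoiser)."""
    if v > theta:
        return v - theta
    if v < -theta:
        return v + theta
    return 0.0


def unfolded_amp(A: Matrix, y: Sequence[float],
                 thresholds: Sequence[float]) -> List[float]:
    """Run one AMP iteration per entry of `thresholds` and return the estimate.

    Each threshold plays the role of a per-layer parameter that a learned,
    unfolded network would train; here the values are simply given.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    z = list(y)  # residual, initialised to the measurements
    for theta in thresholds:
        # Pseudo-data: current estimate plus matched-filter output A^T z.
        r = [x[j] + sum(A[i][j] * z[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(rj, theta) for rj in r]
        # Onsager term: (number of nonzeros / M) times the previous residual.
        onsager = sum(1 for xj in x if xj != 0.0) / m
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        z = [y[i] - Ax[i] + onsager * z[i] for i in range(m)]
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(unfolded_amp(identity, [3.0, 0.0, -2.0], [1.0]))
```

Fixing the number of layers in advance is what turns the iterative algorithm into a network whose thresholds can be fitted from data while the structure of each layer stays that of the model-based iteration.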